LSU Health New Orleans Newsroom

LSU Health New Orleans School of Public Health Faculty Develops Two Novel Statistical Approaches to Detect Interactions of Genetic Variants Associated with Cancers

Hui-Yi Lin, PhD, Associate Professor of Biostatistics at LSU Health New Orleans School of Public Health, has developed two novel statistical methods for detecting interactions of genetic variants associated with cancers or other complex diseases. Both methods are published in Bioinformatics, one of the leading journals in the field.

“Selecting a good statistical method is essential to conducting a solid study,” notes Dr. Lin. “It is like choosing a good recipe for preparing a meal. If the recipe is bad, even good ingredients won’t result in a good meal. The same concept can be applied to genetic association studies.”

During the past decade, genome-wide association studies (GWAS) have successfully identified many inherited genetic variants, or single nucleotide polymorphisms (SNPs), associated with cancers or other complex diseases. SNPs, pronounced “snips,” are genetic variants in single DNA building blocks called nucleotides. For example, when one nucleotide is replaced by another inside a gene or in a region that determines gene function, disease can develop or worsen. According to the U.S. National Library of Medicine, “SNPs occur normally throughout a person’s DNA. They occur once in every 300 nucleotides on average, which means there are roughly 10 million SNPs in the human genome.”

“However, the predictive power for individual genetic variants and the genetic risk scores built based on the GWAS-identified SNPs on cancers is still limited,” Lin says. “It has been shown that gene-gene/SNP-SNP interactions play an important role in the etiology of complex diseases. Although SNP-SNP or gene-gene interaction studies have been emerging, the statistical methods for evaluating SNP-SNP interactions are still underdeveloped.”

The conventional approach to testing SNP interactions is a hierarchical interaction model that includes the main effects of the two SNPs plus their interaction, with both SNPs coded under an additive inheritance mode.

“This approach just tests one specific type of interaction, so will lead to many false negative findings,” adds Lin.
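To make the conventional approach concrete, the following is a minimal sketch of how one row of predictors for such a model might be built. It assumes additive coding means counting copies of the minor allele (0, 1, or 2), which is the usual convention; the function names and the example genotypes are hypothetical and are not taken from Lin's software.

```python
def additive_code(genotype, minor_allele):
    """Additive inheritance mode: number of minor-allele copies (0, 1, or 2)."""
    return genotype.count(minor_allele)


def hierarchical_row(snp1, snp2):
    """Predictors for the hierarchical model:
    intercept, SNP1 main effect, SNP2 main effect, SNP1 x SNP2 interaction."""
    return [1, snp1, snp2, snp1 * snp2]


# Made-up genotypes for three people at two SNPs (minor alleles G and T).
genotypes = [("AA", "GG"), ("AG", "GT"), ("GG", "TT")]

rows = [
    hierarchical_row(additive_code(g1, "G"), additive_code(g2, "T"))
    for g1, g2 in genotypes
]

for row in rows:
    print(row)
```

In a real analysis these predictors would be passed to a regression model, and only the coefficient on the last column would be tested, which is why this approach examines a single type of interaction.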

To identify significant SNP-SNP interactions, Lin developed two statistical methods: the SNP Interaction Pattern Identifier (SIPI) and the Additive-Additive 9 Interaction-model approach (AA9int). SIPI evaluates 45 SNP interaction patterns by considering three major factors: model structure (hierarchical and non-hierarchical), genetic inheritance mode (dominant, recessive and additive), and mode coding direction. Her study demonstrated that SIPI can detect novel SNP interactions that cannot be detected using the conventional statistical approach. These interactions can predict disease outcome better than individual SNPs.
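The three factors can be illustrated with a short sketch. It is not SIPI itself and does not reproduce its list of 45 patterns. The inheritance-mode codings follow the standard definitions; treating "coding direction" as flipping the coded value, and using an interaction-only model as the non-hierarchical example, are assumptions made here for illustration. All names are hypothetical.

```python
def code_snp(minor_count, mode, reverse=False):
    """Code a genotype (0, 1, or 2 minor-allele copies) under an inheritance mode.

    reverse=True flips the coding direction (assumed meaning of "direction").
    """
    if mode == "additive":
        value, top = minor_count, 2
    elif mode == "dominant":
        value, top = int(minor_count >= 1), 1
    elif mode == "recessive":
        value, top = int(minor_count == 2), 1
    else:
        raise ValueError(f"unknown inheritance mode: {mode}")
    return top - value if reverse else value


def model_terms(snp1, snp2, structure):
    """Predictors (without intercept) for two illustrative model structures."""
    interaction = snp1 * snp2
    if structure == "hierarchical":
        return [snp1, snp2, interaction]
    if structure == "interaction_only":  # one example of a non-hierarchical model
        return [interaction]
    raise ValueError(f"unknown model structure: {structure}")


# One person with 1 minor-allele copy at SNP1 and 2 copies at SNP2.
s1 = code_snp(1, "dominant")
s2 = code_snp(2, "recessive")
print(model_terms(s1, s2, "hierarchical"))
print(model_terms(s1, s2, "interaction_only"))
```

Each choice of structure, mode, and direction gives a different set of predictors, so evaluating many such combinations per SNP pair multiplies the number of models that must be fitted.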

“SIPI is statistically powerful but suffers from a large computation burden,” says Lin. “It requires tremendous computational resources.”

To overcome this computational burden, Lin recently proposed AA9int, an evidence-based mini-version of SIPI. The AA9int approach comprises 9 interaction models built by considering non-hierarchical model structure and the additive mode. The simulation study showed that AA9int detected 72-92% of SNP pairs while using just 20% of the computing time required by SIPI. In large-scale studies, AA9int is an efficient and effective tool that can be used either alone or as the screening stage of a two-stage approach (AA9int+SIPI) for detecting SNP-SNP interactions.

Lin’s methods also allow users to input either candidate SNP pairs or candidate SNPs. This distinct feature can significantly reduce computation time by limiting analyses to candidate SNP pairs instead of all possible pairs of candidate SNPs.
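The saving from supplying candidate pairs comes from how quickly the number of possible pairs grows: n SNPs yield n(n-1)/2 pairs. The sketch below shows the difference using made-up SNP identifiers and pair lists; it does not use the interface of Lin's software.

```python
from itertools import combinations
from math import comb


def all_pairs(snps):
    """Every unordered pair of distinct SNPs."""
    return list(combinations(snps, 2))


# Hypothetical SNP identifiers and a hypothetical list of pairs of interest.
candidate_snps = ["snp_a", "snp_b", "snp_c", "snp_d"]
candidate_pairs = [("snp_a", "snp_c"), ("snp_b", "snp_d")]

exhaustive = all_pairs(candidate_snps)
print(len(exhaustive), "pairs if all combinations are tested")
print(len(candidate_pairs), "pairs if only candidate pairs are tested")

# The gap widens quickly: number of pairs among 1,000 SNPs.
print(comb(1000, 2))
```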

Lin applied these two methods to identify SNP-SNP interactions in the angiogenesis pathway associated with prostate cancer aggressiveness. The analysis used data from the Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome (PRACTICAL) consortium cohort, the largest prostate cancer consortium in the world, and was run on the supercomputer from the Louisiana Optical Network Infrastructure (LONI).

“Applying SIPI to the prostate cancer PRACTICAL consortium data with approximately 21,000 patients, we found that the four SNP pairs in EGFR-EGFR, EGFR-MMP16, and EGFR-CSF1 were associated with prostate cancer aggressiveness with the exact or similar pattern in the discovery and validation sets,” Lin says.

Lin's findings demonstrated that SIPI can detect more meaningful interaction patterns than the conventional approach or other existing methods.

“The SNP interaction pairs, identified using these two novel approaches (AA9int or SIPI), can be applied to build risk prediction models or genetic risk scores for cancers or other complex diseases,” she adds. “These identified gene-gene or SNP-SNP interactions can provide insight to understand the biological mechanism of cancer development and may improve cancer diagnosis accuracy and reduce cancer-related deaths in the future.”

Lin is the principal investigator of an NIH/NCI-funded R21 grant to study gene-gene interactions associated with prostate cancer aggressiveness. These two new statistical methods are being applied to analyze genetic data for this project. The preliminary results are promising.